Dataset statistics
| Number of variables | 14 |
|---|---|
| Number of observations | 5570 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 1.0 MiB |
| Average record size in memory | 189.9 B |
Variable types
| Numeric | 13 |
|---|---|
| Categorical | 1 |
Alerts
| Município has a high cardinality: 5570 distinct values | High cardinality |
| CV_HEPatite_BB is highly correlated with CV_HIB and 9 other fields | High correlation |
| CV_HIB is highly correlated with CV_HEPatite_BB and 9 other fields | High correlation |
| CV_DPT is highly correlated with CV_HEPatite_BB and 9 other fields | High correlation |
| CV_POLIO is highly correlated with CV_HEPatite_BB and 9 other fields | High correlation |
| CV_ROTA is highly correlated with CV_HEPatite_BB and 9 other fields | High correlation |
| CV_PNEMO is highly correlated with CV_HEPatite_BB and 9 other fields | High correlation |
| CV_MnCC is highly correlated with CV_HEPatite_BB and 9 other fields | High correlation |
| CV_SCR1 is highly correlated with CV_HEPatite_BB and 9 other fields | High correlation |
| CV_SCR2 is highly correlated with CV_HEPatite_BB and 9 other fields | High correlation |
| CV_Varicela is highly correlated with CV_HEPatite_BB and 9 other fields | High correlation |
| CV_HEPatite_A is highly correlated with CV_HEPatite_BB and 9 other fields | High correlation |
| Município is uniformly distributed | Uniform |
| COD has unique values | Unique |
| Município has unique values | Unique |
| CV_BCG has 177 (3.2%) zeros | Zeros |
Reproduction
| Analysis started | 2022-11-09 00:41:13.926584 |
|---|---|
| Analysis finished | 2022-11-09 00:41:42.031648 |
| Duration | 28.11 seconds |
| Software version | pandas-profiling v3.4.0 |
COD
| Distinct | 5570 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 325358.6278 |
| Minimum | 110001 |
| Maximum | 530010 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 43.6 KiB |
Quantile statistics
| Minimum | 110001 |
|---|---|
| 5-th percentile | 150777.25 |
| Q1 | 251212.5 |
| median | 314627.5 |
| Q3 | 411918.75 |
| 95-th percentile | 510729.55 |
| Maximum | 530010 |
| Range | 420009 |
| Interquartile range (IQR) | 160706.25 |
Descriptive statistics
| Standard deviation | 98491.03388 |
|---|---|
| Coefficient of variation (CV) | 0.3027152977 |
| Kurtosis | -0.5258091553 |
| Mean | 325358.6278 |
| Median Absolute Deviation (MAD) | 74152.5 |
| Skewness | 0.1213411839 |
| Sum | 1812247557 |
| Variance | 9700483754 |
| Monotonicity | Not monotonic |
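The derived figures in the tables above follow from the other reported values. As a quick consistency check, the sketch below recomputes them in plain Python from the numbers printed for this variable. The variable names are illustrative and not part of pandas-profiling.

```python
import math

# Values copied from the report tables above for this variable.
n_obs = 5570
total = 1812247557
std = 98491.03388
q1, q3 = 251212.5, 411918.75
minimum, maximum = 110001, 530010

mean = total / n_obs              # Mean = Sum / number of observations
cv = std / mean                   # Coefficient of variation = std / mean
variance = std ** 2               # Variance = standard deviation squared
iqr = q3 - q1                     # Interquartile range = Q3 - Q1
value_range = maximum - minimum   # Range = Maximum - Minimum

print(round(mean, 4), round(cv, 6), iqr, value_range)
```

Small differences in the last digits are expected, because the report prints rounded values.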
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 110001 | 1 | < 0.1% |
| 353970 | 1 | < 0.1% |
| 354040 | 1 | < 0.1% |
| 354030 | 1 | < 0.1% |
| 354025 | 1 | < 0.1% |
| 354020 | 1 | < 0.1% |
| 354010 | 1 | < 0.1% |
| 354000 | 1 | < 0.1% |
| 353990 | 1 | < 0.1% |
| 353980 | 1 | < 0.1% |
| Other values (5560) | 5560 |
| Value | Count | Frequency (%) |
| 110001 | 1 | |
| 110002 | 1 | |
| 110003 | 1 | |
| 110004 | 1 | |
| 110005 | 1 | |
| 110006 | 1 | |
| 110007 | 1 | |
| 110008 | 1 | |
| 110009 | 1 | |
| 110010 | 1 |
| Value | Count | Frequency (%) |
| 530010 | 1 | |
| 522230 | 1 | |
| 522220 | 1 | |
| 522205 | 1 | |
| 522200 | 1 | |
| 522190 | 1 | |
| 522185 | 1 | |
| 522180 | 1 | |
| 522170 | 1 | |
| 522160 | 1 |
Município
| Distinct | 5570 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 467.3 KiB |
| 110001 Alta Floresta D'Oeste | 1 |
|---|---|
| 353970 Platina | 1 |
| 354040 Populina | 1 |
| 354030 Pontes Gestal | 1 |
| 354025 Pontalinda | 1 |
| Other values (5565) |
Length
| Max length | 39 |
|---|---|
| Median length | 34 |
| Mean length | 18.61059246 |
| Min length | 10 |
Characters and Unicode
| Total characters | 103661 |
|---|---|
| Distinct characters | 80 |
| Distinct categories | 6 |
| Distinct scripts | 2 |
| Distinct blocks | 2 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 5570 |
|---|---|
| Unique (%) | 100.0% |
Sample
| 1st row | 110001 Alta Floresta D'Oeste |
|---|---|
| 2nd row | 110002 Ariquemes |
| 3rd row | 110003 Cabixi |
| 4th row | 110004 Cacoal |
| 5th row | 110005 Cerejeiras |
Common Values
| Value | Count | Frequency (%) |
| 110001 Alta Floresta D'Oeste | 1 | < 0.1% |
| 353970 Platina | 1 | < 0.1% |
| 354040 Populina | 1 | < 0.1% |
| 354030 Pontes Gestal | 1 | < 0.1% |
| 354025 Pontalinda | 1 | < 0.1% |
| 354020 Pontal | 1 | < 0.1% |
| 354010 Pongaí | 1 | < 0.1% |
| 354000 Pompéia | 1 | < 0.1% |
| 353990 Poloni | 1 | < 0.1% |
| 353980 Poá | 1 | < 0.1% |
| Other values (5560) | 5560 |
Length
Histogram of lengths of the category
Words
| Value | Count | Frequency (%) |
| do | 756 | 4.8% |
| são | 364 | 2.3% |
| de | 302 | 1.9% |
| santa | 161 | 1.0% |
| da | 143 | 0.9% |
| nova | 135 | 0.9% |
| sul | 115 | 0.7% |
| rio | 94 | 0.6% |
| dos | 73 | 0.5% |
| josé | 70 | 0.4% |
| Other values (9533) | 13640 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 10283 | 9.9% |
| a | 8791 | 8.5% |
| 0 | 8160 | 7.9% |
| o | 5961 | 5.8% |
| 1 | 4774 | 4.6% |
| 2 | 4591 | 4.4% |
| r | 4532 | 4.4% |
| i | 4388 | 4.2% |
| 3 | 4106 | 4.0% |
| e | 3764 | 3.6% |
| Other values (70) | 44311 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 50872 | |
| Decimal Number | 33420 | |
| Space Separator | 10283 | 9.9% |
| Uppercase Letter | 9010 | 8.7% |
| Other Punctuation | 47 | < 0.1% |
| Dash Punctuation | 29 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 8791 | |
| o | 5961 | |
| r | 4532 | |
| i | 4388 | |
| e | 3764 | 7.4% |
| n | 3196 | 6.3% |
| d | 2553 | 5.0% |
| s | 2423 | 4.8% |
| t | 2293 | 4.5% |
| u | 2155 | 4.2% |
| Other values (27) | 10816 |
Uppercase Letter
| Value | Count | Frequency (%) |
| S | 1137 | |
| C | 970 | |
| P | 911 | 10.1% |
| M | 721 | 8.0% |
| A | 698 | 7.7% |
| B | 602 | 6.7% |
| I | 475 | 5.3% |
| J | 405 | 4.5% |
| G | 391 | 4.3% |
| R | 367 | 4.1% |
| Other values (20) | 2333 |
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 8160 | |
| 1 | 4774 | |
| 2 | 4591 | |
| 3 | 4106 | |
| 5 | 3654 | |
| 4 | 2781 | 8.3% |
| 7 | 1470 | 4.4% |
| 6 | 1422 | 4.3% |
| 9 | 1382 | 4.1% |
| 8 | 1080 | 3.2% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 10283 |
Other Punctuation
| Value | Count | Frequency (%) |
| ' | 47 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 29 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 59882 | |
| Common | 43779 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| a | 8791 | |
| o | 5961 | 10.0% |
| r | 4532 | 7.6% |
| i | 4388 | 7.3% |
| e | 3764 | 6.3% |
| n | 3196 | 5.3% |
| d | 2553 | 4.3% |
| s | 2423 | 4.0% |
| t | 2293 | 3.8% |
| u | 2155 | 3.6% |
| Other values (57) | 19826 |
Common
| Value | Count | Frequency (%) |
| (space) | 10283 | |
| 0 | 8160 | |
| 1 | 4774 | |
| 2 | 4591 | |
| 3 | 4106 | 9.4% |
| 5 | 3654 | 8.3% |
| 4 | 2781 | 6.4% |
| 7 | 1470 | 3.4% |
| 6 | 1422 | 3.2% |
| 9 | 1382 | 3.2% |
| Other values (3) | 1156 | 2.6% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 100822 | |
| None | 2839 | 2.7% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 10283 | 10.2% |
| a | 8791 | 8.7% |
| 0 | 8160 | 8.1% |
| o | 5961 | 5.9% |
| 1 | 4774 | 4.7% |
| 2 | 4591 | 4.6% |
| r | 4532 | 4.5% |
| i | 4388 | 4.4% |
| 3 | 4106 | 4.1% |
| e | 3764 | 3.7% |
| Other values (54) | 41472 |
None
| Value | Count | Frequency (%) |
| ã | 794 | |
| á | 393 | |
| í | 336 | |
| é | 317 | 11.2% |
| ç | 268 | 9.4% |
| ó | 243 | 8.6% |
| â | 161 | 5.7% |
| ú | 101 | 3.6% |
| ô | 71 | 2.5% |
| ê | 70 | 2.5% |
| Other values (6) | 85 | 3.0% |
CV_BCG
| Distinct | 3660 |
|---|---|
| Distinct (%) | 65.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 73.32931418 |
| Minimum | 0 |
| Maximum | 798.5 |
| Zeros | 177 |
| Zeros (%) | 3.2% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 43.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 5.0645 |
| Q1 | 45.83 |
| median | 80.175 |
| Q3 | 97.4625 |
| 95-th percentile | 125.8895 |
| Maximum | 798.5 |
| Range | 798.5 |
| Interquartile range (IQR) | 51.6325 |
Descriptive statistics
| Standard deviation | 42.3616947 |
|---|---|
| Coefficient of variation (CV) | 0.5776911345 |
| Kurtosis | 34.29976691 |
| Mean | 73.32931418 |
| Median Absolute Deviation (MAD) | 22.145 |
| Skewness | 2.489903005 |
| Sum | 408444.28 |
| Variance | 1794.513178 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 0 | 177 | 3.2% |
| 100 | 100 | 1.8% |
| 50 | 19 | 0.3% |
| 75 | 16 | 0.3% |
| 83.33 | 14 | 0.3% |
| 25 | 13 | 0.2% |
| 66.67 | 12 | 0.2% |
| 90.91 | 10 | 0.2% |
| 94.44 | 10 | 0.2% |
| 80 | 10 | 0.2% |
| Other values (3650) | 5189 |
| Value | Count | Frequency (%) |
| 0 | 177 | |
| 0.45 | 1 | < 0.1% |
| 0.48 | 1 | < 0.1% |
| 0.55 | 2 | < 0.1% |
| 0.67 | 1 | < 0.1% |
| 0.7 | 2 | < 0.1% |
| 0.79 | 2 | < 0.1% |
| 0.81 | 1 | < 0.1% |
| 0.82 | 1 | < 0.1% |
| 0.84 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 798.5 | 1 | |
| 727.3 | 1 | |
| 572.66 | 1 | |
| 548.77 | 1 | |
| 464.74 | 1 | |
| 431.91 | 1 | |
| 391.94 | 1 | |
| 268.16 | 1 | |
| 265.94 | 1 | |
| 262.35 | 1 |
CV_HEPatite_BB
| Distinct | 3422 |
|---|---|
| Distinct (%) | 61.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 85.21601975 |
| Minimum | 0 |
| Maximum | 361.54 |
| Zeros | 2 |
| Zeros (%) | < 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 43.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 46.469 |
| Q1 | 67.7075 |
| median | 81.39 |
| Q3 | 97.73 |
| 95-th percentile | 134.78 |
| Maximum | 361.54 |
| Range | 361.54 |
| Interquartile range (IQR) | 30.0225 |
Descriptive statistics
| Standard deviation | 29.34078907 |
|---|---|
| Coefficient of variation (CV) | 0.3443107195 |
| Kurtosis | 8.500584716 |
| Mean | 85.21601975 |
| Median Absolute Deviation (MAD) | 14.925 |
| Skewness | 1.744074483 |
| Sum | 474653.23 |
| Variance | 860.8819035 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 100 | 84 | 1.5% |
| 75 | 25 | 0.4% |
| 77.78 | 19 | 0.3% |
| 80 | 18 | 0.3% |
| 66.67 | 17 | 0.3% |
| 83.33 | 15 | 0.3% |
| 71.43 | 14 | 0.3% |
| 84.62 | 14 | 0.3% |
| 93.75 | 13 | 0.2% |
| 90 | 13 | 0.2% |
| Other values (3412) | 5338 |
| Value | Count | Frequency (%) |
| 0 | 2 | |
| 1.17 | 1 | |
| 1.59 | 1 | |
| 4.35 | 1 | |
| 6.45 | 1 | |
| 8.62 | 1 | |
| 9.44 | 1 | |
| 9.9 | 1 | |
| 9.91 | 1 | |
| 11.63 | 1 |
| Value | Count | Frequency (%) |
| 361.54 | 1 | |
| 350 | 1 | |
| 340.74 | 1 | |
| 316.67 | 1 | |
| 277.78 | 1 | |
| 272.22 | 1 | |
| 265.12 | 1 | |
| 264.29 | 1 | |
| 263.64 | 1 | |
| 261.11 | 1 |
CV_HIB
| Distinct | 3431 |
|---|---|
| Distinct (%) | 61.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 85.31403052 |
| Minimum | 0 |
| Maximum | 361.54 |
| Zeros | 2 |
| Zeros (%) | < 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 43.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 46.5115 |
| Q1 | 67.8325 |
| median | 81.43 |
| Q3 | 97.92 |
| 95-th percentile | 134.857 |
| Maximum | 361.54 |
| Range | 361.54 |
| Interquartile range (IQR) | 30.0875 |
Descriptive statistics
| Standard deviation | 29.34923618 |
|---|---|
| Coefficient of variation (CV) | 0.3440141791 |
| Kurtosis | 8.476601842 |
| Mean | 85.31403052 |
| Median Absolute Deviation (MAD) | 14.91 |
| Skewness | 1.740420109 |
| Sum | 475199.15 |
| Variance | 861.3776641 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 100 | 76 | 1.4% |
| 75 | 25 | 0.4% |
| 80 | 18 | 0.3% |
| 77.78 | 17 | 0.3% |
| 66.67 | 17 | 0.3% |
| 83.33 | 15 | 0.3% |
| 71.43 | 15 | 0.3% |
| 84.62 | 14 | 0.3% |
| 85.71 | 14 | 0.3% |
| 93.75 | 13 | 0.2% |
| Other values (3421) | 5346 |
| Value | Count | Frequency (%) |
| 0 | 2 | |
| 1.17 | 1 | |
| 1.59 | 1 | |
| 4.35 | 1 | |
| 6.45 | 1 | |
| 8.62 | 1 | |
| 9.44 | 1 | |
| 9.9 | 1 | |
| 9.91 | 1 | |
| 11.63 | 1 |
| Value | Count | Frequency (%) |
| 361.54 | 1 | |
| 350 | 1 | |
| 340.74 | 1 | |
| 316.67 | 1 | |
| 277.78 | 1 | |
| 272.22 | 1 | |
| 265.12 | 1 | |
| 264.29 | 1 | |
| 263.64 | 1 | |
| 261.11 | 1 |
CV_DPT
| Distinct | 3431 |
|---|---|
| Distinct (%) | 61.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 85.49964632 |
| Minimum | 0 |
| Maximum | 361.54 |
| Zeros | 2 |
| Zeros (%) | < 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 43.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 46.679 |
| Q1 | 67.925 |
| median | 81.7 |
| Q3 | 98.1675 |
| 95-th percentile | 135 |
| Maximum | 361.54 |
| Range | 361.54 |
| Interquartile range (IQR) | 30.2425 |
Descriptive statistics
| Standard deviation | 29.39396865 |
|---|---|
| Coefficient of variation (CV) | 0.3437905292 |
| Kurtosis | 8.490359389 |
| Mean | 85.49964632 |
| Median Absolute Deviation (MAD) | 14.885 |
| Skewness | 1.746049892 |
| Sum | 476233.03 |
| Variance | 864.0053932 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 100 | 73 | 1.3% |
| 75 | 24 | 0.4% |
| 80 | 18 | 0.3% |
| 77.78 | 16 | 0.3% |
| 66.67 | 16 | 0.3% |
| 83.33 | 16 | 0.3% |
| 85.71 | 13 | 0.2% |
| 71.43 | 13 | 0.2% |
| 69.23 | 13 | 0.2% |
| 91.67 | 12 | 0.2% |
| Other values (3421) | 5356 |
| Value | Count | Frequency (%) |
| 0 | 2 | |
| 1.17 | 1 | |
| 1.59 | 1 | |
| 4.35 | 1 | |
| 6.45 | 1 | |
| 8.62 | 1 | |
| 9.44 | 1 | |
| 9.91 | 1 | |
| 10.1 | 1 | |
| 11.63 | 1 |
| Value | Count | Frequency (%) |
| 361.54 | 1 | |
| 350 | 1 | |
| 340.74 | 1 | |
| 316.67 | 1 | |
| 277.78 | 1 | |
| 272.22 | 1 | |
| 265.12 | 1 | |
| 264.29 | 1 | |
| 263.64 | 2 | |
| 261.11 | 1 |
CV_POLIO
| Distinct | 3362 |
|---|---|
| Distinct (%) | 60.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 97.47475763 |
| Minimum | 0 |
| Maximum | 715.38 |
| Zeros | 3 |
| Zeros (%) | 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 43.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 59.7445 |
| Q1 | 82.425 |
| median | 94.29 |
| Q3 | 107.815 |
| 95-th percentile | 145.088 |
| Maximum | 715.38 |
| Range | 715.38 |
| Interquartile range (IQR) | 25.39 |
Descriptive statistics
| Standard deviation | 29.39536708 |
|---|---|
| Coefficient of variation (CV) | 0.3015690195 |
| Kurtosis | 42.60275208 |
| Mean | 97.47475763 |
| Median Absolute Deviation (MAD) | 12.68 |
| Skewness | 3.217633832 |
| Sum | 542934.4 |
| Variance | 864.0876058 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 100 | 133 | 2.4% |
| 133.33 | 20 | 0.4% |
| 80 | 18 | 0.3% |
| 75 | 18 | 0.3% |
| 128.57 | 17 | 0.3% |
| 125 | 15 | 0.3% |
| 85.71 | 14 | 0.3% |
| 114.29 | 13 | 0.2% |
| 116.67 | 13 | 0.2% |
| 83.33 | 13 | 0.2% |
| Other values (3352) | 5296 |
| Value | Count | Frequency (%) |
| 0 | 3 | |
| 4.35 | 1 | < 0.1% |
| 13.51 | 1 | < 0.1% |
| 13.58 | 1 | < 0.1% |
| 13.74 | 1 | < 0.1% |
| 13.79 | 1 | < 0.1% |
| 15.62 | 1 | < 0.1% |
| 16.07 | 1 | < 0.1% |
| 16.11 | 1 | < 0.1% |
| 16.13 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 715.38 | 1 | |
| 375 | 1 | |
| 360 | 1 | |
| 337.04 | 1 | |
| 304.12 | 1 | |
| 285.75 | 1 | |
| 275 | 1 | |
| 272.22 | 1 | |
| 270 | 1 | |
| 266.67 | 2 |
CV_ROTA
| Distinct | 3258 |
|---|---|
| Distinct (%) | 58.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 97.8920395 |
| Minimum | 0 |
| Maximum | 425 |
| Zeros | 2 |
| Zeros (%) | < 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 43.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 62.63 |
| Q1 | 84.7325 |
| median | 95.62 |
| Q3 | 107.9275 |
| 95-th percentile | 140 |
| Maximum | 425 |
| Range | 425 |
| Interquartile range (IQR) | 23.195 |
Descriptive statistics
| Standard deviation | 25.70106644 |
|---|---|
| Coefficient of variation (CV) | 0.2625450095 |
| Kurtosis | 11.32207726 |
| Mean | 97.8920395 |
| Median Absolute Deviation (MAD) | 11.52 |
| Skewness | 1.670716931 |
| Sum | 545258.66 |
| Variance | 660.5448163 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 100 | 135 | 2.4% |
| 114.29 | 19 | 0.3% |
| 111.11 | 16 | 0.3% |
| 120 | 16 | 0.3% |
| 116.67 | 16 | 0.3% |
| 87.5 | 15 | 0.3% |
| 91.67 | 14 | 0.3% |
| 133.33 | 13 | 0.2% |
| 80 | 12 | 0.2% |
| 75 | 12 | 0.2% |
| Other values (3248) | 5302 |
| Value | Count | Frequency (%) |
| 0 | 2 | |
| 1.59 | 1 | |
| 11.26 | 1 | |
| 12.07 | 1 | |
| 12.9 | 1 | |
| 13.16 | 1 | |
| 13.21 | 1 | |
| 16.98 | 1 | |
| 18.26 | 1 | |
| 19.05 | 1 |
| Value | Count | Frequency (%) |
| 425 | 1 | |
| 311.11 | 2 | |
| 291.67 | 1 | |
| 276.92 | 1 | |
| 270 | 1 | |
| 266.67 | 1 | |
| 263.16 | 1 | |
| 250 | 1 | |
| 245.45 | 1 | |
| 241.3 | 1 |
CV_PNEMO
| Distinct | 3256 |
|---|---|
| Distinct (%) | 58.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 101.304325 |
| Minimum | 0 |
| Maximum | 433.33 |
| Zeros | 2 |
| Zeros (%) | < 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 43.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 67.07 |
| Q1 | 88.135 |
| median | 98.425 |
| Q3 | 111.11 |
| 95-th percentile | 144.404 |
| Maximum | 433.33 |
| Range | 433.33 |
| Interquartile range (IQR) | 22.975 |
Descriptive statistics
| Standard deviation | 25.94379161 |
|---|---|
| Coefficient of variation (CV) | 0.2560975715 |
| Kurtosis | 12.33643329 |
| Mean | 101.304325 |
| Median Absolute Deviation (MAD) | 11.34 |
| Skewness | 1.804307687 |
| Sum | 564265.09 |
| Variance | 673.080323 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 100 | 130 | 2.3% |
| 116.67 | 20 | 0.4% |
| 111.11 | 19 | 0.3% |
| 133.33 | 18 | 0.3% |
| 110 | 17 | 0.3% |
| 125 | 15 | 0.3% |
| 112.5 | 15 | 0.3% |
| 114.29 | 14 | 0.3% |
| 91.67 | 13 | 0.2% |
| 95.45 | 12 | 0.2% |
| Other values (3246) | 5297 |
| Value | Count | Frequency (%) |
| 0 | 2 | |
| 3.17 | 1 | |
| 12.9 | 1 | |
| 13.06 | 1 | |
| 14.72 | 1 | |
| 16.98 | 1 | |
| 18.97 | 1 | |
| 19.19 | 1 | |
| 19.72 | 1 | |
| 21.93 | 1 |
| Value | Count | Frequency (%) |
| 433.33 | 1 | |
| 344.44 | 1 | |
| 323.08 | 1 | |
| 305.56 | 1 | |
| 291.67 | 1 | |
| 270.37 | 1 | |
| 265.79 | 1 | |
| 260 | 1 | |
| 258.44 | 1 | |
| 250 | 1 |
CV_MnCC
| Distinct | 3299 |
|---|---|
| Distinct (%) | 59.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 100.2620682 |
| Minimum | 0 |
| Maximum | 375 |
| Zeros | 2 |
| Zeros (%) | < 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 43.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 64.128 |
| Q1 | 86.4125 |
| median | 97.695 |
| Q3 | 110.53 |
| 95-th percentile | 145.607 |
| Maximum | 375 |
| Range | 375 |
| Interquartile range (IQR) | 24.1175 |
Descriptive statistics
| Standard deviation | 26.87693061 |
|---|---|
| Coefficient of variation (CV) | 0.2680667883 |
| Kurtosis | 9.675608158 |
| Mean | 100.2620682 |
| Median Absolute Deviation (MAD) | 11.935 |
| Skewness | 1.614179705 |
| Sum | 558459.72 |
| Variance | 722.3693992 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 100 | 116 | 2.1% |
| 125 | 20 | 0.4% |
| 116.67 | 17 | 0.3% |
| 111.11 | 16 | 0.3% |
| 112.5 | 15 | 0.3% |
| 114.29 | 15 | 0.3% |
| 93.75 | 14 | 0.3% |
| 120 | 13 | 0.2% |
| 83.33 | 13 | 0.2% |
| 106.25 | 11 | 0.2% |
| Other values (3289) | 5320 |
| Value | Count | Frequency (%) |
| 0 | 2 | |
| 1.59 | 1 | |
| 7.25 | 1 | |
| 13.06 | 1 | |
| 16.13 | 1 | |
| 16.6 | 1 | |
| 17.22 | 1 | |
| 18.33 | 1 | |
| 18.97 | 1 | |
| 19.01 | 1 |
| Value | Count | Frequency (%) |
| 375 | 1 | |
| 362.96 | 1 | |
| 338.46 | 1 | |
| 304.17 | 1 | |
| 300 | 1 | |
| 277.78 | 1 | |
| 270.37 | 1 | |
| 268.18 | 1 | |
| 266.23 | 1 | |
| 254.35 | 1 |
CV_SCR1
| Distinct | 3245 |
|---|---|
| Distinct (%) | 58.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 102.9138025 |
| Minimum | 0 |
| Maximum | 367.74 |
| Zeros | 2 |
| Zeros (%) | < 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 43.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 68.797 |
| Q1 | 89.87 |
| median | 100 |
| Q3 | 112.105 |
| 95-th percentile | 148.0855 |
| Maximum | 367.74 |
| Range | 367.74 |
| Interquartile range (IQR) | 22.235 |
Descriptive statistics
| Standard deviation | 26.12121151 |
|---|---|
| Coefficient of variation (CV) | 0.2538164063 |
| Kurtosis | 10.80717523 |
| Mean | 102.9138025 |
| Median Absolute Deviation (MAD) | 11.11 |
| Skewness | 1.741423626 |
| Sum | 573229.88 |
| Variance | 682.3176909 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 100 | 153 | 2.7% |
| 133.33 | 23 | 0.4% |
| 120 | 21 | 0.4% |
| 110 | 16 | 0.3% |
| 90.91 | 16 | 0.3% |
| 90 | 16 | 0.3% |
| 116.67 | 15 | 0.3% |
| 92.86 | 13 | 0.2% |
| 105.56 | 11 | 0.2% |
| 128.57 | 11 | 0.2% |
| Other values (3235) | 5275 |
| Value | Count | Frequency (%) |
| 0 | 2 | |
| 2.82 | 1 | |
| 4.6 | 1 | |
| 8.77 | 1 | |
| 8.88 | 1 | |
| 11.11 | 1 | |
| 17.95 | 1 | |
| 18.64 | 1 | |
| 19.26 | 1 | |
| 21.88 | 1 |
| Value | Count | Frequency (%) |
| 367.74 | 1 | |
| 354.55 | 1 | |
| 345.41 | 1 | |
| 316 | 2 | |
| 300 | 1 | |
| 290 | 1 | |
| 280 | 1 | |
| 272.22 | 1 | |
| 268.84 | 1 | |
| 260.87 | 1 |
CV_SCR2
| Distinct | 3387 |
|---|---|
| Distinct (%) | 60.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 92.27414722 |
| Minimum | 0 |
| Maximum | 328 |
| Zeros | 3 |
| Zeros (%) | 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 43.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 50 |
| Q1 | 77.44 |
| median | 91.16 |
| Q3 | 105.1525 |
| 95-th percentile | 137.795 |
| Maximum | 328 |
| Range | 328 |
| Interquartile range (IQR) | 27.7125 |
Descriptive statistics
| Standard deviation | 27.20906072 |
|---|---|
| Coefficient of variation (CV) | 0.2948719825 |
| Kurtosis | 5.097385318 |
| Mean | 92.27414722 |
| Median Absolute Deviation (MAD) | 13.84 |
| Skewness | 0.9053714702 |
| Sum | 513967 |
| Variance | 740.3329855 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 100 | 116 | 2.1% |
| 75 | 20 | 0.4% |
| 111.11 | 17 | 0.3% |
| 85.71 | 17 | 0.3% |
| 120 | 15 | 0.3% |
| 112.5 | 15 | 0.3% |
| 116.67 | 14 | 0.3% |
| 125 | 14 | 0.3% |
| 83.33 | 13 | 0.2% |
| 50 | 12 | 0.2% |
| Other values (3377) | 5317 |
| Value | Count | Frequency (%) |
| 0 | 3 | |
| 1.89 | 1 | < 0.1% |
| 2.7 | 1 | < 0.1% |
| 4.55 | 1 | < 0.1% |
| 4.6 | 1 | < 0.1% |
| 4.96 | 1 | < 0.1% |
| 5.07 | 1 | < 0.1% |
| 6.78 | 1 | < 0.1% |
| 7.02 | 1 | < 0.1% |
| 7.17 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 328 | 1 | |
| 322.73 | 1 | |
| 288 | 1 | |
| 257.14 | 1 | |
| 244.64 | 1 | |
| 243.64 | 2 | |
| 240 | 1 | |
| 230.77 | 1 | |
| 230 | 2 | |
| 225.64 | 1 |
CV_Varicela
| Distinct | 3425 |
|---|---|
| Distinct (%) | 61.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 92.48526032 |
| Minimum | 0 |
| Maximum | 328 |
| Zeros | 5 |
| Zeros (%) | 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 43.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 50.5435 |
| Q1 | 77.8325 |
| median | 91.365 |
| Q3 | 105.015 |
| 95-th percentile | 137.5 |
| Maximum | 328 |
| Range | 328 |
| Interquartile range (IQR) | 27.1825 |
Descriptive statistics
| Standard deviation | 27.06415219 |
|---|---|
| Coefficient of variation (CV) | 0.2926320593 |
| Kurtosis | 4.955909596 |
| Mean | 92.48526032 |
| Median Absolute Deviation (MAD) | 13.62 |
| Skewness | 0.8816134481 |
| Sum | 515142.9 |
| Variance | 732.4683335 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 100 | 123 | 2.2% |
| 125 | 19 | 0.3% |
| 80 | 16 | 0.3% |
| 133.33 | 14 | 0.3% |
| 85.71 | 14 | 0.3% |
| 75 | 14 | 0.3% |
| 120 | 13 | 0.2% |
| 84.62 | 12 | 0.2% |
| 88.89 | 12 | 0.2% |
| 105.56 | 11 | 0.2% |
| Other values (3415) | 5322 |
| Value | Count | Frequency (%) |
| 0 | 5 | |
| 3.7 | 1 | < 0.1% |
| 4.18 | 1 | < 0.1% |
| 4.39 | 1 | < 0.1% |
| 5.41 | 1 | < 0.1% |
| 7 | 1 | < 0.1% |
| 7.02 | 1 | < 0.1% |
| 8.48 | 1 | < 0.1% |
| 8.89 | 1 | < 0.1% |
| 9.47 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 328 | 1 | |
| 327.27 | 1 | |
| 280 | 1 | |
| 253.06 | 1 | |
| 245.45 | 1 | |
| 240 | 1 | |
| 230.77 | 1 | |
| 230.36 | 1 | |
| 225.85 | 1 | |
| 219.57 | 1 |
CV_HEPatite_A
| Distinct | 3730 |
|---|---|
| Distinct (%) | 67.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 190.197693 |
| Minimum | 0 |
| Maximum | 672.73 |
| Zeros | 2 |
| Zeros (%) | < 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 43.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 111.367 |
| Q1 | 162.26 |
| median | 187.195 |
| Q3 | 212.77 |
| 95-th percentile | 277.6765 |
| Maximum | 672.73 |
| Range | 672.73 |
| Interquartile range (IQR) | 50.51 |
Descriptive statistics
| Standard deviation | 52.39975167 |
|---|---|
| Coefficient of variation (CV) | 0.2755015103 |
| Kurtosis | 6.277636744 |
| Mean | 190.197693 |
| Median Absolute Deviation (MAD) | 25.26 |
| Skewness | 1.095050737 |
| Sum | 1059401.15 |
| Variance | 2745.733975 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 200 | 113 | 2.0% |
| 177.78 | 18 | 0.3% |
| 187.5 | 17 | 0.3% |
| 166.67 | 15 | 0.3% |
| 222.22 | 14 | 0.3% |
| 266.67 | 14 | 0.3% |
| 171.43 | 14 | 0.3% |
| 160 | 14 | 0.3% |
| 216.67 | 13 | 0.2% |
| 250 | 13 | 0.2% |
| Other values (3720) | 5325 |
| Value | Count | Frequency (%) |
| 0 | 2 | |
| 2.82 | 1 | |
| 7.41 | 1 | |
| 8.62 | 1 | |
| 10.15 | 1 | |
| 15.06 | 1 | |
| 17.54 | 1 | |
| 21.62 | 1 | |
| 21.88 | 1 | |
| 25.14 | 1 |
| Value | Count | Frequency (%) |
| 672.73 | 1 | |
| 632.26 | 1 | |
| 598.55 | 1 | |
| 568 | 1 | |
| 516.67 | 1 | |
| 512 | 1 | |
| 510.2 | 1 | |
| 477.78 | 1 | |
| 476.92 | 1 | |
| 467.86 | 1 |
Auto
The auto setting is an easily interpretable pairwise column metric based on the following mapping of variable-type pairs to methods: categorical-categorical uses Cramér's V; numerical-categorical uses Cramér's V (on a discretized numerical column); numerical-numerical uses Spearman's ρ. This configuration picks the most suitable method for each pair of columns.
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
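The calculation described above can be sketched in plain Python. This is a minimal illustration, not the code pandas-profiling runs. The helper names `average_ranks` and `spearman_rho` are hypothetical, and tied values are assumed to receive the average of their rank positions.

```python
from statistics import mean, pstdev


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to the end of the run of values equal to values[order[i]].
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Covariance of the rank variables divided by the product of their std devs."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = mean([(a - mx) * (b - my) for a, b in zip(rx, ry)])
    return cov / (pstdev(rx) * pstdev(ry))


# A monotonic but nonlinear relationship (y = x**3) still gives rho close to 1.
print(spearman_rho([1, 2, 3, 4, 5], [1, 8, 27, 64, 125]))
```

Because only the ranks enter the formula, any strictly increasing transformation of X or Y leaves ρ unchanged.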
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
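As a minimal sketch of that formula, the function below computes r in plain Python and checks the invariance property. The name `pearson_r` is illustrative. The two input lists are the CV_HEPatite_BB and CV_HIB values from the first five sample rows of this report, so the result describes those five rows only, not the full dataset.

```python
from math import sqrt


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    sx = sqrt(sum((a - mx) ** 2 for a in x) / n)
    sy = sqrt(sum((b - my) ** 2 for b in y) / n)
    return cov / (sx * sy)


# First five sample rows of the report (CV_HEPatite_BB and CV_HIB).
hep_b = [141.69, 97.92, 152.50, 63.61, 137.55]
hib = [141.69, 97.92, 152.50, 63.76, 138.40]

r = pearson_r(hep_b, hib)
# Shifting and rescaling one variable (positive scale) leaves r unchanged.
r_rescaled = pearson_r([2 * v + 5 for v in hep_b], hib)
print(r, r_rescaled)
```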
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
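The pair-counting definition above can be written out directly. The sketch below implements exactly that formula, often called τ-a. The function name `kendall_tau_a` is illustrative, and library implementations commonly apply a tie-adjusted variant instead, so results may differ when the data contain ties.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, as described above."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1      # both variables move in the same direction
        elif direction < 0:
            discordant += 1      # the variables move in opposite directions
        # direction == 0 means a tie in x or y; it counts toward neither group
    n = len(x)
    total_pairs = n * (n - 1) / 2
    return (concordant - discordant) / total_pairs


# Six pairs in total: five concordant, one discordant (the middle two swap order).
print(kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4]))
```

This direct version compares every pair of observations, so its cost grows quadratically with the number of rows.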
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.
A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.
First rows
| | COD | Município | CV_BCG | CV_HEPatite_BB | CV_HIB | CV_DPT | CV_POLIO | CV_ROTA | CV_PNEMO | CV_MnCC | CV_SCR1 | CV_SCR2 | CV_Varicela | CV_HEPatite_A |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 110001 | 110001 Alta Floresta D'Oeste | 90.19 | 141.69 | 141.69 | 141.69 | 147.14 | 99.46 | 156.68 | 104.36 | 123.46 | 88.64 | 87.41 | 173.33 |
| 1 | 110002 | 110002 Ariquemes | 112.32 | 97.92 | 97.92 | 97.92 | 105.90 | 93.41 | 104.80 | 96.36 | 109.48 | 84.30 | 84.42 | 169.81 |
| 2 | 110003 | 110003 Cabixi | 0.00 | 152.50 | 152.50 | 152.50 | 151.25 | 138.75 | 143.75 | 148.75 | 166.67 | 165.43 | 180.25 | 355.56 |
| 3 | 110004 | 110004 Cacoal | 119.21 | 63.61 | 63.76 | 63.90 | 80.49 | 85.01 | 86.03 | 86.39 | 104.63 | 85.21 | 84.51 | 169.31 |
| 4 | 110005 | 110005 Cerejeiras | 42.62 | 137.55 | 138.40 | 137.55 | 137.13 | 125.74 | 127.00 | 137.97 | 129.23 | 120.00 | 123.08 | 250.77 |
| 5 | 110006 | 110006 Colorado do Oeste | 5.61 | 118.69 | 118.69 | 118.69 | 114.95 | 84.58 | 119.16 | 90.19 | 108.15 | 62.66 | 61.80 | 132.19 |
| 6 | 110007 | 110007 Corumbiara | 22.05 | 101.57 | 101.57 | 104.72 | 102.36 | 97.64 | 99.21 | 103.94 | 96.09 | 87.50 | 89.06 | 187.50 |
| 7 | 110008 | 110008 Costa Marques | 108.15 | 127.90 | 127.90 | 127.90 | 129.18 | 93.56 | 122.32 | 132.62 | 119.26 | 88.93 | 88.93 | 171.31 |
| 8 | 110009 | 110009 Espigão D'Oeste | 84.89 | 115.33 | 115.33 | 115.33 | 124.44 | 100.00 | 123.56 | 98.89 | 119.15 | 92.34 | 90.85 | 182.13 |
| 9 | 110010 | 110010 Guajará-Mirim | 88.35 | 83.48 | 83.48 | 84.12 | 100.90 | 80.41 | 139.69 | 85.79 | 152.10 | 68.58 | 63.38 | 134.73 |
Last rows
| | COD | Município | CV_BCG | CV_HEPatite_BB | CV_HIB | CV_DPT | CV_POLIO | CV_ROTA | CV_PNEMO | CV_MnCC | CV_SCR1 | CV_SCR2 | CV_Varicela | CV_HEPatite_A |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 5560 | 522160 | 522160 Uruaçu | 90.82 | 73.98 | 73.98 | 73.98 | 83.16 | 84.52 | 89.29 | 90.48 | 96.92 | 75.91 | 77.90 | 170.29 |
| 5561 | 522170 | 522170 Uruana | 68.42 | 74.34 | 75.66 | 75.66 | 98.68 | 100.66 | 105.92 | 112.50 | 97.28 | 93.20 | 92.52 | 176.87 |
| 5562 | 522180 | 522180 Urutaí | 107.69 | 192.31 | 192.31 | 192.31 | 215.38 | 192.31 | 192.31 | 176.92 | 147.83 | 126.09 | 121.74 | 278.26 |
| 5563 | 522185 | 522185 Valparaíso de Goiás | 80.47 | 56.12 | 56.12 | 56.20 | 80.35 | 79.08 | 73.58 | 79.35 | 85.82 | 72.13 | 64.42 | 165.22 |
| 5564 | 522190 | 522190 Varjão | 84.38 | 103.13 | 103.13 | 103.13 | 121.88 | 125.00 | 125.00 | 146.88 | 162.07 | 134.48 | 134.48 | 248.28 |
| 5565 | 522200 | 522200 Vianópolis | 98.47 | 84.18 | 85.71 | 85.71 | 101.02 | 104.59 | 105.10 | 105.61 | 106.80 | 91.26 | 90.78 | 189.32 |
| 5566 | 522205 | 522205 Vicentinópolis | 98.21 | 75.89 | 75.89 | 76.79 | 93.75 | 103.57 | 99.11 | 93.75 | 91.94 | 84.68 | 85.48 | 167.74 |
| 5567 | 522220 | 522220 Vila Boa | 50.00 | 116.00 | 116.00 | 116.00 | 114.00 | 110.00 | 118.00 | 112.00 | 130.36 | 103.57 | 101.79 | 200.00 |
| 5568 | 522230 | 522230 Vila Propício | 94.37 | 54.93 | 54.93 | 54.93 | 73.24 | 84.51 | 95.77 | 76.06 | 126.92 | 76.92 | 71.15 | 207.69 |
| 5569 | 530010 | 530010 Brasília | 96.98 | 73.66 | 74.40 | 74.66 | 88.64 | 88.87 | 92.41 | 89.52 | 86.58 | 90.06 | 88.88 | 179.54 |